A Self-Adaptive Reinforcement-Exploration Q-Learning Algorithm
Abstract
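To address problems of the traditional Q-Learning algorithm, such as heavy repetition and disequilibrium of exploration, a reinforcement-exploration strategy was used to replace the decayed ε-greedy strategy of traditional Q-Learning, and thus a novel self-adaptive reinforcement-exploration Q-Learning (SARE-Q) algorithm was proposed. First, the concept of a behavior utility trace was introduced in the proposed algorithm, and the probability for each action to be chosen was adjusted according to the behavior utility trace, so as to improve the efficiency of exploration. Second, the attenuation process of the exploration factor ε was designed in two phases, where the first phase centered on exploration and the second one shifted the focus from exploration to utilization, and the exploration rate was dynamically adjusted according to the success rate. Finally, by establishing a list of state access times, the exploration factor of the current state is adaptively adjusted according to the number of times the state has been accessed. A symmetric grid map environment was established via the OpenAI Gym platform to carry out symmetrical simulation experiments on the Q-Learning algorithm, the self-adaptive Q-Learning (SA-Q) algorithm, and the SARE-Q algorithm. The experimental results show that the proposed algorithm has obvious advantages over the first two algorithms in the average number of turning times, the average inside success rate, and the number of times with the shortest planned route.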
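The abstract names three mechanisms but gives no formulas, so the following Python sketch is only a rough illustration of how they could fit together in a tabular agent. Everything specific in it is an assumption made for illustration and is not taken from the paper: the class name `ExplorationSketch`, the weighting `1 / (1 + trace)` for exploratory actions, the phase boundary `phase1_episodes`, the rule `eps_start * (1 - success_rate)` for the second phase, and the `1 / sqrt(1 + visits)` scaling per state. Only the final Q-value update is the standard Q-Learning rule.

```python
import random
from collections import defaultdict


class ExplorationSketch:
    """Illustrative only: every formula below is an assumption, not the paper's."""

    def __init__(self, n_actions, eps_start=1.0, eps_min=0.05,
                 phase1_episodes=100, trace_decay=0.9, seed=0):
        self.n_actions = n_actions
        self.eps_start = eps_start
        self.eps_min = eps_min
        self.phase1_episodes = phase1_episodes  # assumed phase boundary
        self.trace_decay = trace_decay
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)      # Q-table
        self.trace = defaultdict(lambda: [0.0] * n_actions)  # per-action trace
        self.visits = defaultdict(int)                       # state access counts

    def global_epsilon(self, episode, success_rate):
        # Phase 1 (assumed): keep exploration at its starting level.
        if episode < self.phase1_episodes:
            return self.eps_start
        # Phase 2 (assumed): explore less as the success rate rises.
        return max(self.eps_min, self.eps_start * (1.0 - success_rate))

    def state_epsilon(self, state, episode, success_rate):
        # Assumed rule: states accessed more often get a smaller epsilon.
        eps = self.global_epsilon(episode, success_rate)
        return max(self.eps_min, eps / (1.0 + self.visits[state]) ** 0.5)

    def select_action(self, state, episode, success_rate):
        eps = self.state_epsilon(state, episode, success_rate)
        self.visits[state] += 1
        traces = self.trace[state]
        if self.rng.random() < eps:
            # Exploration: actions with a lower trace are more likely (assumed).
            weights = [1.0 / (1.0 + t) for t in traces]
            action = self.rng.choices(range(self.n_actions), weights=weights)[0]
        else:
            # Exploitation: greedy with respect to the current Q-values.
            row = self.q[state]
            action = max(range(self.n_actions), key=row.__getitem__)
        # Decay all traces for this state, then reinforce the chosen action.
        for a in range(self.n_actions):
            traces[a] *= self.trace_decay
        traces[action] += 1.0
        return action

    def update(self, s, a, r, s_next, done, alpha=0.1, gamma=0.9):
        # Standard tabular Q-Learning update.
        target = r if done else r + gamma * max(self.q[s_next])
        self.q[s][a] += alpha * (target - self.q[s][a])
```

In a training loop, `success_rate` would be a running fraction of recent episodes that reached the goal, computed by the caller; how the paper measures it is not stated in the abstract.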
Similar resources
Self-Regulating Action Exploration in Reinforcement Learning
The basic tenet of a learning process is for an agent to learn for only as much and as long as it is necessary. With reinforcement learning, the learning process is divided between exploration and exploitation. Given the complexity of the problem domain and the randomness of the learning process, the exact duration of the reinforcement learning process can never be known with certainty. Using a...
Adaptive Aggregation for Reinforcement Learning with Efficient Exploration: Deterministic Domains
We propose a model-based learning algorithm, the Adaptive Aggregation Algorithm (AAA), that aims to solve the online, continuous state space reinforcement learning problem in a deterministic domain. The proposed algorithm uses an adaptive state aggregation approach, going from coarse to fine grids over the state space, which enables to use finer resolution in the “important” areas of the state ...
Adaptive-Resolution Reinforcement Learning with Efficient Exploration in Deterministic Domains
We propose a model-based learning algorithm, the Adaptive-resolution Reinforcement Learning (ARL) algorithm, that aims to solve the online, continuous state space reinforcement learning problem in a deterministic domain. Our goal is to combine adaptive-resolution approximation scheme with efficient exploration in order to obtain fast (polynomial) learning rates. The proposed algorithm uses an a...
RTP-Q: A Reinforcement Learning System with Time Constraints Exploration Planning for Accelerating the Learning Rate
Reinforcement learning is an efficient method for solving Markov Decision Processes that an agent improves its performance by using scalar reward values with higher capability of reactive and adaptive behaviors. Q-learning is a representative reinforcement learning method which is guaranteed to obtain an optimal policy but needs numerous trials to achieve it. k-Certainty Exploration Learning Sy...
Journal
Journal title: Symmetry
Year: 2021
ISSN: 0865-4824, 2226-1877
DOI: https://doi.org/10.3390/sym13061057